Abstract

In this work, the beta-decay halflives problem is treated as a nonlinear
optimization problem, which is solved in the statistical framework of Machine
Learning (LM). Continuing past similar approaches, we have constructed
sophisticated Artificial Neural Networks (ANNs) and Support Vector Regression
Machines (SVMs) for each class with even-odd character in Z and N to model
globally the systematics of nuclei that decay 100% by the β⁻ mode in their
ground states. The resulting large-scale lifetime calculations generated by
both types of machines are discussed and compared with each other, with the
available experimental data, with previous results obtained with neural
networks, and with estimates from traditional global nuclear models.
Particular attention is paid to the estimates for exotic and halo nuclei, and
we focus on those nuclides that are involved in r-process nucleosynthesis. It
is found that statistical models based on LM can match or even surpass the
predictive performance of the best conventional models of β-decay systematics
and can complement the latter.
1 Introduction



Reliable, quantitative estimates of β⁻-decay halflives of nuclei far from
stability are needed for the experimental exploration of the nuclear landscape
at existing and future radioactive ion-beam facilities and for ongoing major
efforts in astrophysics towards understanding supernova explosions and the
processes of nucleosynthesis, notably the r-process [1]. In the nuclear chart
there is room for some 6000 nuclides between the β-stability line and the
neutron-drip line. Although β⁻-decay properties have already been measured in
terrestrial laboratories for some key r-process nuclei, and more will be
measured in future facilities, the majority of β-decay rates of the
neutron-rich nuclei involved must be estimated from theoretical models.
Several approaches of different levels of sophistication for determining
β-halflives have been proposed and applied over the years. One can mention the
more phenomenological treatments based on Gross Theory (GT) [2], as well as
microscopic treatments that employ the pn Quasiparticle Random-Phase
Approximation (pn-QRPA) (in various versions) [3], [4] or the shell model [5].
The latest hybrid version of the RPA models, developed by Möller and
coworkers, combines the pn-QRPA model with the statistical Gross Theory of the
first-forbidden decay (pnQRPA+ffGT) [6]. There are also some models in which
the ground state of the parent nucleus is described by the extended
Thomas-Fermi plus Strutinsky integral method, the Hartree-Fock BCS method, or
another density functional (DF) method, and which use the continuum QRPA
(CQRPA) [7]. Recently, relativistic pn-QRPA (RQRPA) models have been applied
in the treatment of neutron-rich nuclei in the N ≈ 50, N ≈ 82 and Z ≈ 28 and
50 regions [8]. Despite continuing improvements, the predictive power of these
conventional "theory-thick" models is rather limited for β⁻-decay halflives,
mainly of nuclei far from stability, with deviations from experiment of at
least an order of magnitude and considerable sensitivity to quantities that
are poorly known.

The recent advances in Artificial Intelligence (AI), and especially in
statistical learning theory or Machine Learning (LM), notably Artificial
Neural Networks (ANNs) and Support Vector Machines (SVMs), provide an
alternative opportunity to develop statistical models of observables of
different systems that exhibit significant predictive power. These models are
"theory-thin" and are driven mainly by the data. For example, in the case of
nuclei, any nuclear observable X can be viewed as a mapping from the proton
and neutron numbers Z and N ((Z, N) → X). In LM one attempts to approximate
the mapping by using only a subset of the data for X (training data). LM-based
models have already been developed for several nuclear properties [9],
including atomic masses and ground-state spins and parities. In this work,
which continues previous studies in statistical modeling of nuclear halflife
systematics [10-12], we present global models for the halflives of nuclear
ground states that decay 100% by the β⁻ mode ($T_\beta$) using the ANN and SVM
algorithms, aiming to compare the two algorithms and to investigate further
the potential of modeling with learning machines. In Section 2 we briefly
present the elements of the models. In Section 3 some of the results of the
models are presented and evaluated, and in Section 4 the conclusions and
prospects of the present study are considered.



2 The Models 



A Learning Machine consists of i) an input interface where input variables
external to the device (e.g. Z and N) are fed in coded form, ii) a system of
intermediate elements or units that process the input, and iii) an output
interface where an estimate of the corresponding observable of interest (say
the beta halflife $T_\beta$) appears for decoding. Given an adequate body of
training data (learning set), a suitable training algorithm is used to adjust
the parameters of the machine to produce good performance on this set and good
generalization on test examples (test set) absent from the training set. The
machine thus interpolates or extrapolates. ANNs are nonlinear computational
structures, inspired by biological neural systems, which consist of
interconnected groups (layers) of artificial neurons (processing units). The
connections between the units (weights) determine their function [13]. In
particular, the feedforward networks (multilayer perceptrons) on which we
focus in this work are fully connected but have no lateral or feedback
connections, so the information flows from the input to the output. SVMs are
learning systems with a rigorous basis in the statistical learning theory
developed by Vapnik and Chervonenkis [14] (VC theory) and belong to the class
of kernel methods (meaning that they implicitly perform a nonlinear mapping of
the input data into a high-dimensional feature space). There are similarities
as well as differences between ANNs and SVMs. The differences have mainly to
do with the tradeoff between complexity and generalization ability and with
the "nature" of their parameters. During SVM training a reduced number of
training patterns (support vectors) is picked, and this determines the
architecture (optimal number of neurons).

In this work we have constructed four separate ANN and SVM models that
determine the halflives of the nuclides according to the pairing of Z and N
(even-Z-even-N (EE), even-Z-odd-N (EO), odd-Z-even-N (OE), and odd-Z-odd-N
(OO)). We briefly list below the main features of these models; further
information on the methodology is found in Refs. [9, 15] and [11, 16] for the
ANN and SVM models, respectively. The four ANN models are fully-connected,
multilayer feedforward networks with architecture symbolized by
[2-5-5-5-1|81]. The activation function of the neuron-like units is a
hyperbolic-tangent sigmoid function in the intermediate (hidden) layers and a
saturated linear function in the output layer. The two input units encode Z
and N, and the single output unit encodes $\log_{10} T_\beta$. The
Levenberg-Marquardt backpropagation algorithm has been used to train the
network, with Bayesian regularization and cross-validation to improve
generalization and with the Nguyen-Widrow method for initialization of the
network. The four SVM models use a pA-type kernel (a composite kernel obeying
Mercer's theorem and constructed linearly by combining the Anova (A) kernel
and a polynomial (p) kernel) with the two pA parameters p and d equal to 1 and
8, respectively, and the control parameter C equal to 3.861 x 10''3 for all
four classes.
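
As a minimal sketch of the [2-5-5-5-1|81] architecture described above, the
following Python/NumPy code counts the adjustable parameters of a
fully-connected 2-5-5-5-1 network and evaluates one forward pass with
hyperbolic-tangent hidden units and a saturated linear output. The count of 81
weights and biases agrees with the number after the bar in the architecture
symbol, which we take here to denote the total number of parameters. The
random weights, the function names, the rescaled inputs and the symmetric
output range [-1, 1] are illustrative assumptions; the Levenberg-Marquardt
training, the Bayesian regularization and the Nguyen-Widrow initialization are
not reproduced.

```python
import numpy as np

LAYERS = [2, 5, 5, 5, 1]  # inputs (Z, N), three hidden layers, one output


def count_parameters(layers):
    """Number of weights plus biases of a fully-connected feedforward net."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layers[:-1], layers[1:]))


def init_network(layers, rng):
    """Random placeholder weights (an assumption, not Nguyen-Widrow)."""
    return [(rng.normal(size=(n_out, n_in)), rng.normal(size=n_out))
            for n_in, n_out in zip(layers[:-1], layers[1:])]


def forward(params, x):
    """tanh in the hidden layers, output clipped to [-1, 1] (assumed range)."""
    a = np.asarray(x, dtype=float)
    for weights, bias in params[:-1]:
        a = np.tanh(weights @ a + bias)
    weights, bias = params[-1]
    return np.clip(weights @ a + bias, -1.0, 1.0)


n_params = count_parameters(LAYERS)
params = init_network(LAYERS, np.random.default_rng(0))
# Hypothetical input: Z and N already rescaled to the interval [-1, 1].
output = forward(params, [0.1, -0.3])
print(n_params, output.shape)
```

The per-layer counts are 15, 30, 30 and 6, which sum to 81. In such a setup
the clipped output would be decoded back to $\log_{10} T_\beta$ by inverting
whatever scaling is applied to the targets.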

The experimental data used in developing our models of β-decay systematics
have been taken from the Nubase2003 evaluation [17]. Restricting attention to
those cases in which the ground state of the parent decays 100% through the β⁻
channel, we form a subset of the beta-decay data denoted by NuSet-A,
consisting of 905 nuclides. We also formed a more restricted data set, called
NuSet-B, by eliminating from NuSet-A those nuclei having a halflife greater
than 10⁶ s. The halflives in this subset range from 0.15 × 10⁻² s for ³⁵Na to
0.20 × 10⁶ s for ²⁴⁷Pu. NuSet-B consists of 838 nuclides (overall set): 672
(80%) of them (learning set) have been randomly chosen to train the ANNs or to
find the support vectors in the case of SVMs; of those left, 83 (10%)
(validation set) have been similarly chosen to validate the learning procedure
in the case of ANNs or to guide the determination of a small number of
parameters, mainly those entering the inner-product kernel, in the case of
SVMs; the remaining 83 (≈ 10%) (test set) are used to evaluate the accuracy of
the prediction. Considering performance on the above sets, we speak of
operation in overall, learning, validation and prediction modes. Having
excluded the few long-lived examples from NuSet-A, one is then dealing with a
more homogeneous collection of nuclides, a property that facilitates the
training of network models. Accordingly, we have focused our efforts on
NuSet-B, and since the examples still span 9 orders of magnitude in $T_\beta$,
it is natural to work with $\log_{10} T_\beta$.
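
The class assignment and the random partition described above can be sketched
in Python as follows. Only the four parity classes, the 10⁶ s cutoff, the
logarithmic transformation and the 838 → 672/83/83 partition are taken from
the text; the function names, the random seeds, the rounding rule and the
synthetic (Z, N, halflife) records are hypothetical placeholders and do not
reproduce NuSet-B.

```python
import math
import random

CUTOFF_S = 1.0e6  # halflives greater than 10^6 s are excluded


def parity_class(z, n):
    """Return 'EE', 'EO', 'OE' or 'OO' from the parity of Z and N."""
    return ("E" if z % 2 == 0 else "O") + ("E" if n % 2 == 0 else "O")


def split_sizes(total, val_frac=0.10, test_frac=0.10):
    """Learning, validation and test sizes; hold-out sizes are rounded down."""
    n_val = int(total * val_frac)
    n_test = int(total * test_frac)
    return total - n_val - n_test, n_val, n_test


def random_split(records, seed=0):
    """Shuffle the records and cut them into learning, validation, test."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_learn, n_val, _ = split_sizes(len(shuffled))
    return (shuffled[:n_learn],
            shuffled[n_learn:n_learn + n_val],
            shuffled[n_learn + n_val:])


# Synthetic placeholder records (Z, N, halflife in s), for illustration only.
rng = random.Random(1)
records = [(rng.randint(8, 94), rng.randint(8, 150), 10 ** rng.uniform(-3, 7))
           for _ in range(1000)]

# Apply the cutoff and work with log10 of the halflife as the target.
kept = [(z, n, math.log10(t)) for z, n, t in records if t <= CUTOFF_S]
learn, val, test = random_split(kept)

sizes = split_sizes(838)  # the overall-set size quoted in the text
print(sizes, parity_class(11, 24))
```

With 838 records this rounding rule gives 672, 83 and 83 examples. It is only
one possible convention and is not claimed to be the procedure used for the
individual parity classes.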



3 Results And Discussion 



The performance of our ANN and SVM global models is first evaluated in
Table 1 by direct comparison with the experimental data using a commonly
used statistical metric, namely the Root Mean Square Error
($\sigma_{\mathrm{RMSE}}$):

$$\sigma_{\mathrm{RMSE}} = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 \right]^{1/2} \qquad (1)$$

where $y_i = \log_{10} T_{\beta,\mathrm{calc}}$ and
$\hat{y}_i = \log_{10} T_{\beta,\mathrm{exp}}$, and $N$ is the total number of
nuclides in each case. We mention for comparison the $\sigma_{\mathrm{RMSE}}$
values of an earlier ANN model [12], which are 1.08 and 1.82 for the total
learning and test sets, respectively, as well as those of the ANN model
developed recently by means of the whole basis [10], which are 0.53, 0.60 and
0.65 for the total learning, validation and test sets, respectively. In
Table 2, a further assessment is made by tabulating the performance measures
of Möller and collaborators [6]: $M$, its standard deviation $\sigma_M$, and
$\Sigma$. These measures are defined as follows in terms of the variable
$r_i = y_i - \hat{y}_i = \log_{10} (T_{\beta,\mathrm{calc}} / T_{\beta,\mathrm{exp}})_i$:

$$M = \frac{1}{N} \sum_{i=1}^{N} r_i, \qquad \sigma_M = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( r_i - M \right)^2 \right]^{1/2}, \qquad \Sigma = \left[ \frac{1}{N} \sum_{i=1}^{N} r_i^2 \right]^{1/2} \qquad (2)$$



Superior models should have $M$, $\sigma_M$ and $\Sigma$ near zero. A
comparison follows with two recent global theory-thick models of the above
collaboration, namely the FRDM+pnQRPA and the pnQRPA+ffGT models [6].
Finally, in Fig. 1, halflives of β⁻-decaying nuclides that are found near or
on a typical r-process path, with neutron separation energy below 3 MeV,
derived by means of the present LM models are compared with those from the
pnQRPA+ffGT calculation and from a calculation by Pfeiffer and coworkers [18]
(labeled GT*) based on the early Gross Theory (GT) of Takahashi et al. [2a]
with updated mass values.
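
The quality measures used in Tables 1 and 2 can be sketched in Python as
follows. The sketch assumes the log-ratio form of the Möller measures, with
r_i = log10(T_calc / T_exp) for each nuclide; the function names and the
sample halflives are hypothetical and serve only as an illustration.

```python
import math


def log_ratios(t_calc, t_exp):
    """r_i = log10(T_calc / T_exp) for paired halflives in the same units."""
    return [math.log10(c / e) for c, e in zip(t_calc, t_exp)]


def rmse(r):
    """Root-mean-square error of the log10 halflives."""
    return math.sqrt(sum(x * x for x in r) / len(r))


def moller_measures(r):
    """Mean M, standard deviation sigma_M about M, and rms deviation Sigma."""
    n = len(r)
    m = sum(r) / n
    sigma_m = math.sqrt(sum((x - m) ** 2 for x in r) / n)
    big_sigma = math.sqrt(sum(x * x for x in r) / n)
    return m, sigma_m, big_sigma


# Hypothetical halflives in seconds, for illustration only.
t_exp = [0.010, 0.25, 3.0, 40.0, 500.0]
t_calc = [0.020, 0.20, 3.0, 80.0, 250.0]

r = log_ratios(t_calc, t_exp)
m, sigma_m, big_sigma = moller_measures(r)
print(round(m, 3), round(sigma_m, 3), round(big_sigma, 3))
```

With these definitions Σ² = M² + σ_M², so a small Σ requires both a small
systematic bias M and a small scatter σ_M, and Σ coincides with the
root-mean-square error evaluated on the same set of nuclides. The entries of
Table 2 are consistent with this relation to within rounding (for example,
0.19² + 0.61² ≈ 0.64²).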



From the above comparison one can conclude that the LM-based models give
similar results, which are close to the experimental data. In Fig. 1, the
results given by the SVM model are almost equal to the experimental values.
This occurs because, unlike neural networks, where the connections between
neurons are initialized to random values, SVMs use almost all of the training
patterns themselves as "nodes", a fact that may lead to better interpolation
but often to worse extrapolation. Furthermore, the comparison of the results
derived by the LM models with those of the models presented in Table 2, as
well as with others [15,16], leads to the conclusion that the former perform
as well as or better than the latter. This is partially ascribed to the larger
number of parameters of the LM-based models. Regarding the performance of the
present ANN and SVM models relative to that of previous LM-based models, the
consideration of different quality measures slightly favors the present ones.
However, a more detailed analysis (see Ref. [15]) shows that the subdivision
of the data into four (Z, N) parity classes can lead to spurious fluctuations
in the prediction of lifetimes for nuclides of isotopic and isotonic chains.
This favors the use of the ANN model developed recently by means of the whole
basis [10].
Table 1

Root-mean-square errors σ (= σ_RMSE) (Eq. 1) for learning, validation and test
sets, achieved by the current ANN and SVM models (developed for even-even,
even-odd, odd-even and odd-odd classes in Z and N) of β⁻-decay halflives (with
a cutoff at 10⁶ s).

         ANN models                             SVM models
         Learning     Validation   Test         Learning     Validation   Test
Class    N     σ      N     σ      N     σ      N     σ      N     σ      N     σ
EE       131   0.36   16    0.41   16    0.62   131   0.55   16    0.57   16    0.62
EO       179   0.38   22    0.44   22    0.39   179   0.41   22    0.42   22    0.51
OE       172   0.44   21    0.46   21    0.53   172   0.41   21    0.47   21    0.47
OO       190   0.52   24    0.42   24    0.33   190   0.52   24    0.40   24    0.52
Total    672   0.41   83    0.44   83    0.51   672   0.47   83    0.46   83    0.53

Table 2

Quality indices M, σ_M and Σ (Eq. 2) for the present ANN models in Overall
(a) and Prediction (b) Modes and for the FRDM+pnQRPA and pnQRPA+ffGT models
of Ref. [6]. The number n stands for the nuclides with experimental halflives
below the prescribed limit.

              (a) ANNs - Overall Mode        (b) ANNs - Prediction Mode
T_β,exp (s)   n     M       σ_M    Σ         n     M       σ_M    Σ
< 1           252   0.02    0.30   0.30      8     -0.19   0.61   0.64
< 10          396   0.02    0.35   0.35      28    -0.16   0.57   0.60
< 100         529   0.03    0.36   0.36      54    -0.05   0.51   0.52
< 1000        653   0.05    0.39   0.39      77    0.03    0.49   0.49
< 10⁶         838   0.00    0.45   0.45      83    0.03    0.52   0.53

              (c) FRDM + pnQRPA [6]          (d) pnQRPA + ffGT [6]
T_β,exp (s)   n     M       σ_M    Σ         n     M       σ_M    Σ
< 1           184   0.03    0.57   0.57      184   -0.08   0.48   0.49
< 10          306   0.14    0.77   0.78      306   -0.03   0.55   0.55
< 100         431   0.19    0.94   0.96      431   -0.04   0.61   0.61
< 1000        546   0.34    1.28   1.33      546   -0.04   0.68   0.68
< 10⁶

Fig. 1. Halflives for β⁻-decaying nuclides that are found near or on a typical
r-process path, with neutron separation energy lower than or equal to 3 MeV,
derived by means of the present ANN and SVM models, compared with the
experimental data and with those from pnQRPA+ffGT [6] and GT* [18]
calculations.

4 Conclusion and Prospects 







In this work, the beta-decay halflives problem is treated as a nonlinear
optimization problem, which is solved in the statistical framework of Machine
Learning using Artificial Neural Networks (ANNs) and Support Vector Regression
Machines (SVMs). It seems that ANNs and SVMs demonstrate similar performance
and that our statistical large-scale calculations can match or even surpass
the predictive performance of the best conventional global calculations
outside the valley of stability. Moreover, this way of confronting the
beta-halflives problem could give an estimate of the degree to which the
nucleonic numbers determine the beta-decay systematics of a nuclear system.
Accordingly, we plan further studies of the systematics of beta decay using
LMs, with the object of continued enhancement of their predictive power and of
possibly gaining some new physical insight.